Surveying soil-borne disease development on wild rocket salad crop by proximal sensing based on high-resolution hyperspectral features

Wild rocket (Diplotaxis tenuifolia, Brassicaceae) is a baby-leaf vegetable crop of high economic interest, used in ready-to-eat minimally processed salads, with an appreciated taste and nutraceutical features. Disease management is key to achieving the sustainability of the entire production chain in intensive systems, where synthetic fungicides are limited or not permitted. In this context, soil-borne pathologies, much feared by growers, are becoming a real emergency. Digital screening of green beds can be implemented in order to optimize the use of sustainable means. The current study used a high-resolution hyperspectral array (spectroscopy at 350–2500 nm) to attempt to follow the progression of symptoms of Rhizoctonia, Sclerotinia, and Sclerotium disease across four different severity levels. A Random Forest machine learning model reduced dimensions of the training big dataset allowing to compute de novo vegetation indices specifically informative about canopy decay caused by all basal pathogenic attacks. Their transferability was also tested on the canopy dataset, which was useful for assessing the health status of wild rocket plants. Indeed, the progression of symptoms associated with soil-borne pathogens is closely related to the reduction of leaf absorbance of the canopy in certain ranges of visible and shortwave infrared spectral regions sensitive to reduction of chlorophyll and other pigments as well as to modifications of water content and turgor.


Materials and methods
The present study was conceptualized and conducted based on two independent experiments, named "guide experiment" (Exp_Gui) and "applicative experiment" (Exp_App), following the workflow shown in Fig. S1.
Leaf and canopy reflectance data (see the "Reflectance measurements" section) were collected from soil-inoculated wild rocket plants grown in controlled environments. From Exp_Gui we obtained leaf-scale reflectance data. The resulting dataset was used for calibration and validation purposes, and then divided into a "training dataset" and a "testing dataset" (see the "Data preparation and analysis" section), to model spectral responses at increasing levels of disease severity, and to develop new VIs capable of monitoring aerial symptoms of soil-borne diseases in wild rocket. Reflectance data at canopy scale were acquired from Exp_App. The resulting dataset, called the "canopy dataset", was used to evaluate the applicability of the newly developed VIs using plant growth and data acquisition conditions comparable to those normally applicable, or expected to be applicable, in real growth environments for early detection of fungal diseases.

Fungal pathogens and inoculum preparation. Virulent isolates of soil-borne fungal pathogens R.
solani (RhS) AG-4, S. rolfsii (ScR), and S. sclerotiorum (ScS) used in this study were obtained from the collection of fungi maintained on potato dextrose agar (PDA, Difco Laboratories, Detroit, Mich.) slants at 20 °C at the Research Centre for Vegetable and Ornamental Crops, Council for Agricultural Research and Economics (CREA), located in Pontecagnano Faiano, Italy. They were previously tested for pathogenicity on wild rocket; stock cultures were stored on PDA slants at 4 °C until use. For each pathogen, the artificial inoculum was prepared by infecting with 20 mycelia plugs (0.5 cm diameter) 500 g millet, previously autoclaved (22 min., 121 °C) and saturated (v/v) with 0.1 × potato dextrose broth (PDB, Difco Laboratories, Detroit, Mich.). After 21 days of incubation in the dark at 25 °C, the infected millet was ready for use after air-drying.

Experiment description.
In both experiments, wild rocket (D. tenuifolia) cv. Tricia (Enza Zaden, Italy) was sown in black plastic pots (140 mm diameter) filled with autoclaved peat substrate (Klasmann-Deilmann GmbH, Geeste, Germany), at a density of 5 plants pot −1 , and grown in a nursery glasshouse to obtain 1-monthold plants. Then, plants were transferred to benches in a climatic chamber at 25 ± 2 °C with a 12-h photoperiod (Exp_Gui) or in an experimental greenhouse (Exp_App). Daily irrigation with dechlorinated sterile water was performed to maintain water holding capacity around 80% of nominal, while a basal NPK mix liquid fertilization was applied twice a week.
For both Exp_Gui and Exp_App, three disease conditions (RhS-D, ScR-D, and ScS-D, disease) were imposed, each at three different levels of disease severity plus a healthy not-inoculated treatment. Each treatment consisted of a total of 10 pots (replicates) for each combination. The different levels of disease severity were obtained by www.nature.com/scientificreports/ inoculating different groups of plants at 10,7, and 3 days before sampling, i.e., spectral acquisitions. Pots were inoculated by incorporating 1 g (dry weight) pot −1 of infected millet in the first layer (5 cm depth) of the substrate. Eventually, each pathosystem consisted of 40 pots (200 plants), characterized by different levels of disease severity for a total of 120 pots (600 plants) in the whole trial. Leaves from Exp_Gui were visually assessed for their disease severity with a score on a 0-3 scale. For pots from Exp_App a visual inspection was also assessed by using a 0-3 rating scale according to the mean symptoms of foliage disease, for each pathosystem. Leaves/pots were classified as: healthy (Disease class 0), and diseased at early (Disease class 1), moderate (Disease class 2), and severe (Disease class 3) degree, according to Manganiello et al. 24 . Disease class 0 was observed only from the not inoculated healthy controls (Fig. 1).

Reflectance measurements.
In Exp_Gui, leaf reflectance (350-2500 nm) was measured by contact under laboratory conditions with a portable non-imaging spectroradiometer (FieldSpec ® 4 Hi-Res, ASD Inc., Boulder, CO, USA) using an optical fiber contact probe (ASD Plant Probe; ASD Inc., USA) with a 10 mm field of view and an integrated halogen reflector lamp, which minimized atmospheric interference on the recorded signal (active sensor). To increase the quality and homogeneity of the acquired data, the instrument was warmed up for 90 min before measurement. The pre-calibrated Spectral 99% white reference panel was used for calibration, run every 5 min during data acquisition. Each sample scan represented an average of 20 reflectance spectra. Leaves were randomly selected from pots/plants characterized by different disease levels, resulting in a homogenous sampling across disease severity classes. A total of 216, 212, and 222 reflectance acquisitions were obtained for RhS-D, ScR-D and ScS-D, respectively. Leaves were removed from the plants and spectral reflectance was measured immediately on-site on their adaxial side.
For Exp_App, the canopy reflectance was assessed outdoors under full sunlight conditions using the ASD FieldSpec ® 4 Hi-Res (ASD Inc., Boulder, CO, USA). All spectra were obtained using the spectroradiometer with a 25° field of view (FOV). Potted plants were placed in a fixed plate on a black background, maintaining a vertical distance between the sensor and the top of the plants of 10 cm (passive sensor). Three measurements per pot were conducted to obtain an average spectrum in the "canopy dataset" (n = 40 for each disease).
Dry weight, leaf area index. Plants in Exp_App were also evaluated for dry weight (DW, g pot −1 ), leaf area (LA, cm 2 pot −1 ), and dry matter content (DM, %) of aerial biomass. Plants were sampled and transferred to the laboratory for aerial fresh and DW determinations, after drying them in an oven at 70 °C until constant weight. Prior to drying, LA was recorded using LI-3100 area meter (Li-Cor Corporation, Lincoln, NE, USA).
Data preparation and analysis. After removing the noisy bands at the extreme wavelengths, the spectral range from 400 to 2450 nm (2051 wavelengths) was analysed. For both leaf and canopy reflectance data, preprocessing involved splice correction to remove the abrupt changes in the recorded values of the neighbouring wavelengths, which appear at the spectral intervals between the three integrated sensors of ASD FieldSpec spectroradiometer. Splice correction was achieved using the commercial software package ASD-View Spec Pro (ASD Inc., Boulder, CO, USA).
Leaf reflectance was assessed under constant light and temperature conditions, so further pre-processing to smooth the spectrum and reduce signal noise was not performed 32 . Prior to spectral statistical analyses, leaf reflectance was pre-processed using first derivative and mean-centering, in order to reduce the baseline offset by amplifying small spectral features 33 . Data pre-treatments was performed with the ParLeS software 34 . For each disease, data were then randomly divided into two datasets by splitting the observations (approximately 70:30) www.nature.com/scientificreports/ into training (calibration) and test (validation) datasets (training set, n = 150, 149, and 160 for RhS-D, ScR-D and ScS-D, respectively; testing set, n = 66, 63, and 62 for RhS-D, ScR-D and ScS-D, respectively). The RF supervised machine learning algorithm 35 , a classification algorithm consisting of an ensemble of decision trees, was used by applying R Caret package 36 to select the most predictive bands to be used in the construction of the VIs as well as to evaluate the VIs performance in discriminating among the four classes for each disease. To this purpose, a two-step strategy was applied using two types of continuous variables, i.e., (i) reflectance features and (ii) VIs derived from Exp_Gui, to classify the categorical variables (Disease Classes 0, 1, 2, 3) through a set of decision tree process to develop the RF predictive models. The complexity of the input variables was then reduced gradually by selecting at each step the features (bands or VIs) with the highest predictive power through the VSURF R package 37 and testing the performance of the restricted variable sets using the Confusion Matrix (R Caret package) and Kappa coefficient. In total, 4 RF models were obtained using: for the first step, (1) all the wavelengths (All λ) and (2) only the selected wavelengths (Selected λ); for the second step, (3) all the derived VIs (All VIs) and (4) only the selected VIs (Selected VIs).
The VIs related to disease severity classes were calculated from the selected wavelengths in the most important regions of the full spectrum. For RhS-D, ScR-D and ScS-D, wavelength selection was performed during the first RF step.
Two band combinations of the spectral indices, Normalized Difference Spectral Indices (NVIs) and Simple Ratio Indices (SRs) were generated for all the possible combinations of the selected wavelengths following the equations: where reflectance(wavelength i ) and reflectance(wavelength j ) are the spectral reflectance values of two random wavelengths (λi and λj, respectively).
The newly obtained NVIs and SRs were correlated (Spearman rank correlation) 38 with the disease severity levels for each pathogen, to identify the most suitable indicators associated with the identification and development of the pathogenesis.
Finally, newly developed VIs strongly correlated with treatments were tested using the canopy dataset, obtained by canopy reflectance measurements from Exp_App. For each disease, Spearman rank correlations were applied with R software 39 to assess the correlation between the VIs and disease severity (at the pot scale) as well as between the VIs and LA, DW and DM, all performed at the pot scale. NVIs and SRs were calculated from the raw spectra after splice correction, as previously indicated. One-way ANOVA followed by the post hoc Least Significant Difference test (LSD) was applied to assess statistical relationships among VIs values over the disease classes using the R package agricolae 40 .
Machine learning modelling, correlations and ANOVA were performed using the statistical software R version 4.0.2 39 .
Ethics approval. The experimental research and field studies on plants, including the collection of plant material, complied with relevant institutional, national, and international guidelines and legislation.

Results
Disease index and plant growth. All plants in the infected pots developed disease symptoms, as expected, with varying severity depending on the specific pathogen and timing of inoculation. ScR proved to be the most aggressive pathogen producing the basal hydropic lesions immediately after the first day of interaction. Overall, all diseases progressed from hydropic halos/lesions on the plant collar and basal parts of the petioles to rot, wilt, and death of the plant. ScR formed the typical mycelial felt on the targeted basal organs and at a late stage produced the typical light brown spherical sclerotia. On the other hand, ScS signalled its presence with a slight production of white mould, whereas RhS did not burst.
Based on Exp_App, the effects of soil-borne pathogens on plant growth were monitored. Fungal diseases significantly affected plant DW, LA and DM, as they progressed through the four severity levels (Fig. 2).

Hyperspectral fingerprints. Typical reflectance signatures of diseased leaves of wild rocket inoculated
with RhS, ScR, and ScS are shown in Fig. 3A-C, respectively. Regardless of the fungal pathogen, the reflectance of leaves of infected plants tended to increase abruptly in the VIS region with advancing disease severity. At the highest disease severity level, the profile of the spectral signature was also affected, especially between 550 and 700 nm, resulting in the altered redshift characteristics and red-edge positions. Despite differences among the pathogens, the reflectance of the most infected leaves decreased in the NIR regions and increased again from around 1400 nm, without affecting the signature patterns (Fig. 3).

Selection of the features by Random Forest modelling.
In order to select the most discriminant wavelengths for each pathogen, the RF supervised machine learning algorithm was trained and validated using, respectively, 70 and 30% of the input observations from Exp_Gui. The predictive performance of the models (All λ) is highlighted in each related Caret's Confusion Matrix and related statistics shown in Table 1. RF ran www.nature.com/scientificreports/ www.nature.com/scientificreports/ efficiently on the first derivative reflectance data showing a stable prediction of disease levels compared to the healthy Control, already at the early stages of infection. The higher accuracy level was recorded for ScR-D (0.726) followed by RhS-D and ScS-D (0.625, and 0.500, respectively); these accuracy levels were significantly greater than the no information rate (p-value [A > N] < 0.05). The R package VSURF was then implemented on the obtained RF models in order to remove irrelevant information by the recursive elimination of the smallest important wavelengths and so leading to a subset of explanatory variables highly related to the model response, by minimizing the prediction error. This variable selection method returns a smaller subset of variables focusing more closely on the RF prediction objective avoiding redundancy. The second RF model (Selected λ) was then constructed for each pathogen. In this case, we www.nature.com/scientificreports/ observed high levels of the proportion of correctly classified cases as well as good metrics, indicating substantial invariance in the predictive value of the models compared to the previous benchmark and so demonstrating the efficiency of band selection (Table 1). In particular, an increase of accuracies of RF models for RhS-D and ScS-D (0.762 and 0.546, respectively), and a decrease for the ScR-D ones, were observed. Moreover, the Balanced Accuracy predicted better the two extreme disease classes (Class 0 and 3)-with values > 0.75-than the intermediate ones (Class 1 and 2), both when all wavelengths were used as entry variables of models and when the RF models were constructed using only the selected wavelengths (Tables S1, S2, S3). This also emerges from the confusion matrices where the misclassified observations tend to zero as we move away from the diagonal (Table 1). For each telluric pathogenic disease, the selected wavelengths from RF models, most related to the classification response, are reported in Table 2. These bands fell in three regions of the full spectrum, mainly in the VIS-NIR and very few in the SWIR. A total of 9, 16, and 12 wavelengths for RhS-D, ScR-D, and ScS-D, respectively, were selected and combined to build 135, 72, 99 both SRs and NVIs, respectively, for each disease (data not shown). It is worth noting that, regardless of the pathogen, the models selected neighbouring wavelengths in the spectrum (e.g., 559/560 nm; 601/602 nm; around 665 nm; around 724 nm). Some differences were observed for ScR-D in the VIS (i.e., 447 and 854 nm) and, in general among diseases, in the SWIR region.
The so obtained VIs were used for the construction of the third RF classification model (All VIs) for each disease. Indeed, thanks to the flexibility of the RF algorithm, all VIs can be used to get a more accurate prediction of the RhS-D, ScR-D and ScS-D levels. The related confusion matrix showed general accuracy predictions of all VIs based RF models that stood at 0.656, 0.645, and 0.530 for RhS-D, ScR-D, and ScS-D, respectively (Table 3).  0  1  2  3  0  1  2  3  0  1  2  3   0  12  9  3  0  0  16  4  2  0  0  14  9  4  1   1  2  10  8  0  1  2  11  5  0  1  2  3  1  1   2  0  1  4  1  2  0  2  8  0  2  1  6  6  4   3  0  0  0  14  3  0  1  1  10  3  0  2  2 Table 2. Wavelengths (nm) selected from reflectance data recorded on wild rocket leaves inoculated with Rhizoctonia solani (RhS-D), Sclerotium rolfsii (ScR-D), and Sclerotinia sclerotiorum (ScS-D) (Exp_Gui). The wavelenghts were selected starting from RF models by the recursive elimination of the smallest important ones. VIS-NIR = visible-near infrared regions; SWIR = short-wave infrared region. www.nature.com/scientificreports/ These models were propaedeutic for the most effective classifier VIs selection; indeed, the subsequent VSURF step allowed the selection of the 12, 10, and 10 most suitable indices, listed in Table 4 and described in the next section, to model disease severity for their ability to classify via RF algorithm the RhS-D, ScR-D and ScS-D. To confirm their consistency for the mentioned objective, the newly obtained VIs fed the fourth RF model (Selected VIs), giving metrics shown in Table 3 as well as in Tables S1, S2, and S3.
The applicability of those VIs was tested using the canopy dataset from Exp_App. For each pathogen, VIs were calculated starting from canopy reflectance data at the pot scale (Fig. 5). Water absorption bands and noisy spectral regions over 1351-1409 nm and 1801-1949 nm were excluded from the canopy spectral signature and so the NVI 601-1908 (ScS-D) was not calculated.
The identified VIs showed a significant relationship (based on Spearman correlation coefficients) with the disease classes of severity (Table 5). RhS-D maintained 11 VIs showing high correlation coefficients with disease levels (ranging from 0.60 to 0.65) but characterized by a good correlation (Pearson coefficient) with some important parameters of crop growth (i.e., LA and DM) ( Table 5). The best results were achieved by the NVIs and SRs Table 3. Confusion Matrices and overall statistics of Random Forest models trained and validated on (3) all the VIs generated by the combinations of the selected wavelengths (All VIs) and on (4) the selected VIs (selected VIs)-reflectance data recorded from wild rocket leaves inoculated with Rhizoctonia solani (RhS-D), Sclerotium rolfsii (ScR-D), and Sclerotinia sclerotiorum (ScS-D) at four disease severities (Disease class 0, 1, 2 and 3) (Exp_Gui). VIs were selected by the recursive elimination of the less important indices to obtain a subset of explanatory variables highly related to the model response. Predicted classes   0  1  2  3  0  1  2  3  0  1  2  3   0  13  12  2  0  0  15  8  3  0  0  13  8  3  1   1  1  7  2  1  1  3  8  4  0  1  4  5  2  1   2  0  1  9  1  2  0  2  7  0  2  0  5  8  5   3  0  0  2  13  3  0  0  2  10  3  0  2  0  www.nature.com/scientificreports/ reporting the reflectance value at 559 and 602 nm. ScR-D presented the better performing indices among all the investigated pathogens. Those VIs were able to discriminate infection and well related to all measured growth parameters, including the plant DW at harvest. Finally, for ScS-D it was possible to maintain only three indices, characterized by a low correlation with disease severity, LA, and DM (Table 5).

Discussion
Greenhouse cultivation of leafy vegetables such as wild rocket on large areas requires continuous disease monitoring for the ready identification of the spatial/temporal distribution of disease symptoms within the crop, which allows increasing effectiveness (less quantities) and targeting (more precision) of preventive and/or curative control means. For this purpose, proximal detection methods using optoelectronic probes have proved suitable to feed farmers' decision-making processes by ensuring automatic, rapid, non-destructive, and large-scale disease assessment, and reducing the reaction time to the early outbreak 22 .
In this study, a high-resolution hyperspectral array [i.e., VIS/infrared (IR) spectroscopy] was used to attempt to follow the progression of R. solani, S. rolfsii, and S. sclerotiorum disease symptoms on D. tenuifolia through four different severity levels. To the best of our knowledge, this is the first study that investigates the responses of wild rocket to soil-borne fungal pathogens in terms of spectral reflectance. Starting from the full spectral signature we introduced and calculated new vegetation indices aimed at reducing the dimensionality of the dataset, suitable to discriminate between healthy and infected classes of wild rocket leaves, which could be effectively transferable to real (or real like) growing conditions, in order to accelerate disease detection. The methodological information obtained in the present study can be transferred to other (leaf) crop-(soil-borne) pathogen systems. Table 4. Hyperspectral vegetation normalized difference (NVI) simple ratio (SR) indices (dimensionless) used in this study for wild rocket disease detection. Spearman correlation coefficients between disease severities (Disease classes 0, 1, 2, 3) and the newly developed vegetation indices calculated from reflectance data recorded on rocket leaves inoculated with Rhizoctonia solani (RhS), Sclerotium rolfsii (ScR), and Sclerotinia sclerotiorum (ScS) (Exp_Gui). *p > 0.05; **p < 0.01. www.nature.com/scientificreports/ www.nature.com/scientificreports/ Modelling was performed on an experiment conducted under controlled conditions (i.e., growth chamber) and reflectance data were collected in the laboratory (at leaf scale) using a specific light source 41 , thus resulting in high-resolution and high-precision measurements. As expected, fungal diseases influenced the spectral signature: from a qualitative point of view, healthy plants showed low reflectance values in the VIS and SWIR regions as well as high reflectance values in the NIR, highlighting changes in physiological and biological parameters in responses to pathogenesis, i.e., reduction in chlorophyll content, changes in leaf water content, and degeneration of internal leaf structure, respectively [42][43][44] . In agreement with previous results 41 , a slight blue-shift in the position of the red-edge due to the symptoms of developing necrosis was also observed. However, slight differences among pathogens were detected (e.g., ScR-D near 1700 nm). This deserves further investigation as specific pathosystems are characterized by specific spatiotemporal dynamics, as reported in sugar beet 32 , tomato 45 , and potato 46 . It is worth specifying that the soil-borne diseases studied generally allow comparable symptomatology, despite differences in terms of disease progression, and their impact on foliage is almost indirect.
The latter aspect might also have been related to the ability of the RF models to discriminate healthy and highly infected leaves with higher accuracy, in comparison with intermediate disease severity, as previously observed in similar studies performed with Propagation Neural Network spectral calibration models on blackleg potato that reduced accuracy on whole plants in field trials 25 . The RF algorithm represents one of the most common statistical machine learning methods used in modelling and classifying qualitative classes of disease severity from hyperspectral data and has been successfully employed in many host-pathogen combinations 41,47 . Recently, the RF algorithm has been applied by our group to model the detection of powdery mildew and tracheofusariosis in wild rocket 48,49 . However, since the continuous narrowband datasets contain redundant spectral information, the selection of significant wavebands was performed, based on the results of RF decision trees 50 .
The main objective of this study was to obtain new spectral VIs, starting from the selected wavelengths in the three full-spectrum regions (i.e., VIS, NIR and SWIR), useful in the scaling up and speeding up of disease detection 51 in wild rocket plants under greenhouse or open-field conditions. In conceptual agreement with the described pipeline, similar hyperspectral data mining was previously conducted to identify the best indices with high healthy/diseased separation capacity of late blight in tomato 52 , Cercospora leaf spot in beet 53 , powdery mildew in squash 29 , bacterial wilt in peanut 27 , Yellow Rust in wheat 54 , and Alternaria leaf blight in potato 55 .
Regardless of fungal diseases, the new calculated VIs (at leaf scale) were generally effective in separating healthy from infected leaves; however, some VIs could also discriminate intermediate disease intensities. Interestingly, in the case of ScR-D, which seemed to be characterized by the highest hyperspectral spectroscopy Table 5. Spearman correlation coefficients for disease severities (Disease classes 0, 1, 2, 3) and Pearson correlation coefficients for dry weight (DW, g plot −1 ), leaf area (LA, cm 2 plot −1 ), and dry matter (DM, %) and the newly developed vegetation indices (normalized difference indices: NVI; simple ratios: SR-listed in Table 4) calculated from reflectance data recorded on wild rocket inoculated with Rhizoctonia solani (RhS-D), Sclerotium rolfsii (ScR-D), and Sclerotinia sclerotiorum (ScS-D) (Exp_App). Reflectance data were acquired at canopy scale (named as crop dataset; degrees of freedom = 78), so that was not possible to calculate all the newly developed vegetation indices (see the text for further information). Only indices showing significant Spearman correlation coefficients for DI variable were reported. *p > 0.05; ns non-significant. www.nature.com/scientificreports/ performances, probably due to its greater aggressiveness, the selected NVIs and SRs indices are based on green and light red bands highlighting the effects of disease on leaf pigments concentrations as well as on photosynthetic efficiency 56 . The SR 1377-664 was characterized by a good ability to separate disease intensities associated with RhS-D, confirming the role of IR bands to estimate cell damages. In particular, the spectral region centered around 1377 nm represented a key region in the identification of physically damaged mushrooms during the monitoring of their quality 57 . Moreover, for RhS-D, the selected wavelengths in green and orange regions, at 559 and 602 nm, determined high-performance VIs. Those wavelengths have been found ascribed to the leaf chlorophyll content. In particular, λ559, which is normally ignored in most VIs, has been found to be associated with yield and biomass accumulation 58 .
To verify the transferability of de novo VIs, they were tested against the canopy dataset. Close to the spectral regions of the VIs in the present study, MCARI has been previously found to be responsive to the early stage of disease development of Cercospora leaf spot and rust on sugar beet 32 , to tomato chlorophyll content and LAI reduction as affected by bacterial wilting 59 , and to asymptomatic shoot status in presence of abiotic root stress in pepper 60 . On the other hand, the indices estimating pigments, Plant Pigment Ratio and Carotenoid Indices, based respectively on range 550-450 and 500-570 nm, have been, respectively, used to predict plant N status and chlorophyll 61 and carotenoid levels as a proxy of plant stress 62 .
As a matter of fact, the ongoing symptoms associated with soil-borne pathogens, i.e., losses of chlorophyll and other pigments, discoloration, reduced water content, wilting and, finally, desiccation, are closely linked to reduced leaf absorbance in the canopy 63 . In sugar beet, Reynolds et al. 64 calculated two narrowband hyperspectral vegetation indices, such as Pigment Specific Simple Ratio (working on chlorophyll a) and modified Spectral Ratio (close to canopy reflectance at 445, 705 and 750 nm), for early detection of Rhizoctonia crown and root rot, while chlorophyll/carotenoids dependent SRPI (simple ratio pigment index) was proposed to discriminate the soil borne disease from Heterodera schachtii nematode symptoms 65 . Huang and Apan 66 found VIS-NIR narrow range wavelengths valuable to predict the incidence of Sclerotinia rot disease in celery. For most of the literature about VIs, the discrimination between healthy and severely diseased plants leaves is attributable to the strong changes that occurred in the later soil-borne infection stages 24 . The relevance of these indices seems to be limited to the drastic exclusion of infected plants that can be successfully practised, whereas, in the late stage, disease progression is too advanced to apply preventive strategies.
In the present study, the newly formulated VIs at high-resolution and low-throughput levels (leaf scale) were tested directly on an independent dataset (high-throughput experiment), built on reflectance measurements performed on the canopy, confirming their transferability to ordinary growing and measuring conditions. Interestingly, for all diseases a good correlation between computable VIs and LA was observed.

Conclusions
Our study underlines the importance of scaling up the results of reflectance spectroscopy, performed at leaf scale with a contact probe, under controlled conditions and thus without by environmental factor effects (i.e., light intensity and direction, temperature, and canopy architecture), to larger-scale experiments (canopy scale), characterized by increasingly high-throughput and important for practical applications.
New vegetation indices able to discriminate disease stages were obtained. The most interesting wavelengths fell in the visible and infrared spectral regions, including interesting regions/wavelengths already investigated and previously identified under stress conditions. Extraction of the most significant spectral information from the big dataset is crucial to simplify the production of optoelectronic sensors, suitable for plant disease detection 41 specifically aiming to cheap and innovative surveying devices, to support sustainable disease management in greenhouse or open-field conditions. On perspective, this technology, possibly integrated with different sensors (i.e., thermal sensors and/or 3-D shape sensors) 67 , can improve pathogen detection, since the availability of significant spectral information linked to machine vision-based remote sensing makes feasible the real-time mapping of disease management.